Compact and Fast Algorithms for Regular Expression Search

نویسنده

  • T. Paranthoën
چکیده

This paper describes an improvement of the brute force determinization algorithm in the case of homogeneous NFAs, as well as its application to pattern matching. Brute force determinization with limited memory may provide a partially determinized automaton, but its bounded complexity makes it be a fail-safe procedure contrary to the classical subset construction. We investigate the particular case of pattern matching based on Glushkov NFAs, which involves the determinization of an almost homogeneous automaton. Actually, our algorithm is inspired both by recent results of Champarnaud concerning the subset automaton of a homogeneous NFA and by the algorithm recently designed by Navarro and Raffinot to implement the brute force determinization of the Glushkov NFA of a regular pattern. We obtain an algorithm with an average exponentially smaller memory requirement for a given level of determinization, which, considering a bounded memory, implies a quadratically smaller parsing time.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

SECURING INTERPRETABILITY OF FUZZY MODELS FOR MODELING NONLINEAR MIMO SYSTEMS USING A HYBRID OF EVOLUTIONARY ALGORITHMS

In this study, a Multi-Objective Genetic Algorithm (MOGA) is utilized to extract interpretable and compact fuzzy rule bases for modeling nonlinear Multi-input Multi-output (MIMO) systems. In the process of non- linear system identi cation, structure selection, parameter estimation, model performance and model validation are important objectives. Furthermore, se- curing low-level and high-level ...

متن کامل

A Differential Evolution and Spatial Distribution based Local Search for Training Fuzzy Wavelet Neural Network

Abstract   Many parameter-tuning algorithms have been proposed for training Fuzzy Wavelet Neural Networks (FWNNs). Absence of appropriate structure, convergence to local optima and low speed in learning algorithms are deficiencies of FWNNs in previous studies. In this paper, a Memetic Algorithm (MA) is introduced to train FWNN for addressing aforementioned learning lacks. Differential Evolution...

متن کامل

Fast and compact regular expression matching

We study 4 problems in string matching, namely, regular expression matching, approximate regular expression matching, string edit distance, and subsequence indexing, on a standard word RAM model of computation that allows logarithmic-sized words to be manipulated in constant time. We show how to improve the space and/or remove a dependency on the alphabet size for each problem using either an i...

متن کامل

Fast Motif Search in Protein Sequence Databases

Regular expression pattern matching is widely used in computational biology. Searching through a database of sequences for a motif (a simple regular expression), or its variations is an important interactive process which requires fast motif-matching algorithms. In this paper, we explore and evaluate various representations of the database of sequences using suffix trees for two types of query ...

متن کامل

Practical Private Regular Expression Matching

Regular expressions are a frequently used tool to search in large texts. They provide the ability to compare against a structured pattern that can match many text strings and are common to many applications, even programming languages. This paper extends the problem to the private two-party setting where one party has the text string and the other party has the regular expression. The privacy c...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004